22883 103.1046.01

PATENT TRADEMARK OFFICE

This application is submitted in the name of the following inventor(s):

Inventor      Citizenship       Residence City and State

Alan ROWE     United Kingdom    San Jose, California

The assignee is Network Appliance, Inc., a California corporation having
an office at 495 East Java Drive, Sunnyvale, CA 94089.

Title of Invention

A Mechanism to Survive Server Failures When Using The CIFS Protocol

Background of the Invention

1. Field of the Invention

This invention relates to transparent recovery from server failures and elective
reboots while maintaining consistent data using the CIFS file system protocol.

2. Related Art

The Common Internet File System (CIFS) protocol is defined by Microsoft.
It enables collaboration on the Internet by defining a remote file access protocol that
allows applications to share data on local disks and network file servers. CIFS
incorporates the same high-performance, multi-user read and write operations, locking,
and file-sharing semantics that are the backbone of today's sophisticated enterprise
computer networks. With CIFS, users with different platforms and computers can share
files without having to install new software.

CIFS generally runs over TCP/IP, and uses the SMB (Server Message 
Block) protocol found in Microsoft Windows® for file and printer access; therefore, 
CIFS will allow all PC applications, not just Web browsers, to open and share files across 
the Internet. 

With CIFS, both the client and the server maintain state about filenames, 
file contents, directories, and various other aspects of the files and directories; thus CIFS 
is a "stateful" protocol. File content is cached via a cooperative process between client 
and server code, and this is where problems can occur. The state survives only as long as 
the session between the server and the client survives, and this session survives only as 
long as the underlying network connection (generally TCP/IP) survives. 

When a server that is currently supporting one or more sessions fails or has 
to be purposefully rebooted, all sessions being supported are lost. CIFS has no protocol 
for re-establishment of a session after such a fatal error, or for synchronization of the 
client/server state to the pre-crash state. CIFS does support fault tolerance in the face of
network and server failures, in that some CIFS clients can restore connections and reopen
files that were open prior to interruption; however, any data that was being
edited and had not been saved is lost. As a result, a server failure is regarded as a
catastrophic event in the CIFS world.

Accordingly, it would be advantageous to provide a technique that
re-establishes server-client sessions that were utilizing CIFS after a server
failure or elective reboot so that operation resumes where it ended prior to server
unavailability.

Summary of the Invention 

The invention includes a method and system for re-establishing sessions
between a server and clients that were using the CIFS protocol. Two types of situations
may occur. The first type occurs when a system administrator purposefully reboots the
server or a purposeful takeover occurs in a clustered configuration. For these elective
reboots of a server, a series of tasks is performed: (1) the server stops accepting
incoming CIFS requests, (2) the server completes processing of active CIFS requests, (3)
all active CIFS state and networking state is captured in non-volatile storage (CIFS data
structures are static at this point), (4) the server is rebooted, (5) state is rebuilt for the
rebooted machine from that which was saved in non-volatile storage; in a takeover
configuration, state is made available through transmission or some form of non-uniform
memory access, and (6) incoming CIFS requests are once again accepted and operation
resumes.
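The six elective-reboot tasks can be sketched as a toy model. This is an illustrative sketch only, not the implementation of the invention: the class FileServer, its methods, and the dictionary standing in for non-volatile storage are all hypothetical names, and requests are reduced to strings.

```python
from collections import deque


class FileServer:
    """Toy model of the six elective-reboot tasks; not a real CIFS server."""

    def __init__(self, nvram):
        self.nvram = nvram        # dict standing in for non-volatile storage (assumption)
        self.accepting = True
        self.active = deque()     # requests accepted but not yet completed
        self.sessions = {}        # session id -> list of completed requests

    def submit(self, session_id, request):
        """Return False while the server is ignoring incoming requests."""
        if not self.accepting:
            return False          # the client is expected to retry later
        self.active.append((session_id, request))
        return True

    def drain(self):
        """Task 2: process every active request to completion."""
        while self.active:
            session_id, request = self.active.popleft()
            self.sessions.setdefault(session_id, []).append(request)

    def elective_shutdown(self):
        self.accepting = False    # task 1: stop accepting requests
        self.drain()              # task 2: finish active requests
        # Task 3: state is static now, so a copy of it is consistent.
        self.nvram["state"] = {k: list(v) for k, v in self.sessions.items()}
        self.nvram["elective"] = True

    @classmethod
    def boot(cls, nvram):
        """Tasks 4 to 6: a fresh instance rebuilds state and accepts requests."""
        server = cls(nvram)
        if nvram.get("elective"):
            server.sessions = {k: list(v) for k, v in nvram["state"].items()}
            nvram["elective"] = False
        return server
```

In this sketch a request submitted between elective_shutdown() and boot() is simply refused, which models the window in which the client keeps retrying.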

The second type occurs when the server reboots without warning or there is
an unplanned takeover due to server failure. These unplanned occurrences require that the
following tasks be performed: (1) state is saved persistently at predetermined intervals to
non-volatile storage, (2) when the system crashes and reboots or is taken over, state is
restored from the non-volatile storage, or in a takeover configuration, state is made
available through transmission to a subsequent machine or through some form of
non-uniform memory access, (3) operations that were in progress resume at the steps they
were at prior to the crash, and (4) new CIFS requests are accepted. All of the preceding
is transparent to the clients and no data are lost.

Brief Description of the Drawings

Figure 1 illustrates a block diagram of a system for server failure survival
when using the CIFS protocol.

Figure 2 illustrates a file server elective reboot/takeover process in a system for
server failure survival when using the CIFS protocol.

Figure 3 illustrates a file server non-elective reboot/takeover process in a system
for server failure survival when using the CIFS protocol.

Figure 4 illustrates a file server non-elective takeover process in a system to
survive server failures when using the CIFS protocol.

Figure 5 illustrates critical state saving points in a mechanism to survive server
failures when using the CIFS protocol.

Detailed Description of the Preferred Embodiment

In the following description, a preferred embodiment of the invention is
described with regard to preferred process steps and data structures. Embodiments of the
invention can be implemented using general purpose processors or special purpose
processors operating under program control, or other circuits, adapted to particular
process steps and data structures described herein. Implementation of the process steps
and data structures described herein would not require undue experimentation or further
investigation.

Lexicography 

The following terms refer to or relate to aspects of the invention as described 
below. The descriptions of general meanings of these terms are not intended to be 
limiting, only illustrative. 

• client and server - These terms refer to a relationship between two devices, 
particularly to their relationship as a client and server, not necessarily to any particular 
physical devices. 

• Client device and server device - These terms refer to devices taking on the role of a 
client device or a server device in a client-server relationship (such as an HTTP web 
client and web server). There is no particular requirement that any client devices or 
server devices be individual physical devices. They can each be a single device, a set 
of cooperating devices, a portion of a device, or some combination thereof. 

• Procedure - A procedure is a self-consistent sequence of computerized steps that lead
to a desired result. These steps are defined by one or more computer instructions.
These steps are performed by a computer executing the instructions that define the
steps. Thus, the term "procedure" can refer to a sequence of instructions, a sequence
of instructions organized in a programmed-procedure or programmed-function, or a
sequence of instructions organized within programmed-processes executing in one or
more computers.

• CIFS - Common Internet File System protocol defines a standard for remote file
access using millions of computers at a time across different platforms that can share
files.

• NetBIOS - An application programming interface (API) that augments the DOS
BIOS by adding special functions for local-area networks (LANs).

• NBT - An implementation of NetBIOS over TCP/IP.

• SMB - Server Message Block. A message format used by DOS and Windows to
share files, directories and devices. NetBIOS is based on the SMB format.

As noted above, these descriptions of general meanings of these terms are not
intended to be limiting, only illustrative. Other and further applications of the invention,
including extensions of these terms and concepts, would be clear to those of ordinary
skill in the art after perusing this application. These other and further applications are
part of the scope and spirit of the invention, and would be clear to those of ordinary skill
in the art, without further invention or undue experimentation.

System Elements

Figure 1 shows a block diagram of a mechanism to survive server failures
when using the CIFS protocol.

A preferred embodiment of the system 100 can include a client device 110,
a client communication link 120, a communications network 130, a first server 140, a
second server 150, a server communications link 160, an interconnect 170, and a mass
storage 180.

The client device 110 includes a processor, memory, and mass storage (not
shown but understood by one skilled in the art). Typically, the client device 110 is
associated with a user.

The client communication link 120 couples the client device 110 to the 
communications network 130. In a preferred embodiment, the communications network 
130 includes an Internet, intranet, extranet, virtual private network, enterprise network, or 
another form of communication network. 

The first server 140 includes a processor, a main memory (not shown but
understood by one skilled in the art), and a first non-volatile storage 141. In a preferred
embodiment the first server 140 and the client device 110 are separate devices; however,
there is no requirement in any embodiment that they be separate devices. In a preferred
embodiment, the first non-volatile storage 141 includes any electronic storage medium
capable of retaining state without power or by some auxiliary power source (such as
non-volatile random access memory, magnetic and optical drives).

The second server 150 includes a processor, a main memory (not shown but
understood by one skilled in the art), and a second non-volatile storage 151. In a
preferred embodiment the second server 150 and the client device 110 are separate
devices; however, there is no requirement in any embodiment that they be separate
devices. In a preferred embodiment, the second non-volatile storage 151 includes any
electronic storage medium capable of retaining state without power or by some auxiliary
power source (such as non-volatile random access memory, magnetic and optical
drives).

Additionally, the invention is applicable to both a standalone server and a
server cluster; however, the second server 150 is used only in applications of the
invention where the functions of the first server 140 are to be taken over by the second
server 150. There is no requirement in any embodiment that the second server 150 be
present in non-takeover applications of the invention.

A server communications link 160 couples the first server 140 and the
second server 150 to the communications network 130.

An interconnect 170 couples the first server 140 to the second server 150 
providing bi-directional communication between the two servers. 

The mass storage 180 is coupled to both the first server 140 and the second 
server 150. In a preferred embodiment the mass storage 180 includes magnetic and 
optical disk arrays, and other devices capable of storing relatively large amounts of data. 

Method of Operation - Elective Takeover and Elective Reboot

Figure 2 illustrates a file server elective reboot/takeover process, indicated
by general reference character 200. The file server elective reboot/takeover process 200
initiates at a 'start' terminal 201. The file server elective reboot/takeover process 200
continues to a 'boot system' procedure 203 which enables the first server 140 to boot.

A 'flag active' decision procedure 205 determines whether the first server
140 is rebooting following an elective reboot. If the 'flag active' decision procedure 205
determines that the first server 140 has been subjected to a reboot, the file server elective
reboot/takeover process 200 continues to a 'restore state' procedure 227.

A 'receive CIFS requests' procedure 207 allows user requests to be
received by the first server 140.

A 'process CIFS requests' procedure 209 allows the first server 140 to
respond to requests from the client device 110 by providing access to data contained in
the mass storage 180.

An 'initiate elective process?' procedure 211 determines whether the
system is to be purposely taken offline (e.g., by the systems operator for maintenance
purposes). If the 'initiate elective process?' procedure 211 determines that an elective
shutdown has not been initiated, the file server elective reboot/takeover process 200
continues to the 'receive CIFS requests' procedure 207.

An 'ignore CIFS requests' procedure 213 causes the first server 140 to
ignore all incoming CIFS requests from the client device 110. This is perceived by the
client device 110 as a network delay and will not by itself terminate the session. The
client device 110 will resubmit CIFS requests until accepted or until the session is timed
out (approximately 45 - 60 seconds from receipt of the first rejection by the client device
110), whichever comes first. The invention enables acceptance of CIFS requests prior to
a session timing out.
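The retry window can be illustrated with a small simulation. This is a sketch under stated assumptions: the function name session_survives and the fixed 5-second retry interval are hypothetical, and the 45-second default is only the lower end of the approximate 45 - 60 second range given above.

```python
def session_survives(outage_seconds, retry_interval=5.0, session_timeout=45.0):
    """Return True if a resubmitted request is accepted before the session times out.

    The server ignores requests for `outage_seconds`. The client resubmits every
    `retry_interval` seconds (assumed value) and gives up once `session_timeout`
    seconds have passed since the first ignored request.
    """
    elapsed = 0.0
    while elapsed < session_timeout:
        if elapsed >= outage_seconds:
            return True       # the server is accepting again, so the retry succeeds
        elapsed += retry_interval
    return False              # the session timed out before the server came back
```

Under these assumptions a 30-second outage is absorbed as a delay, while a 60-second outage ends the session, which is why the save, reboot, and restore steps have to finish inside the timeout.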

A 'drain CIFS requests' procedure 215 ensures that all currently active 
CIFS requests are processed to completion. 

An 'elective takeover?' decision procedure 217 determines whether an
elective takeover has been selected by the systems operator. If the 'elective takeover?'
decision procedure 217 determines that an elective takeover has been selected by the
systems operator, the file server elective reboot/takeover process 200 continues to an
'elective takeover save state' procedure 223.

An 'elective reboot save state' procedure 219 causes the current state of the
first server 140 to be stored in the first non-volatile storage 141. This includes the setting
of the flag value to indicate a planned reboot of the first server 140.

A 'shut down system' procedure 221 causes the first server 140 to be shut 
down. The file server elective reboot/takeover process 200 terminates through an "end" 
terminal 229. 

An 'elective takeover save state' procedure 223 causes the current state of
the first server 140 to be stored in the first non-volatile storage 141 and the second
non-volatile storage 151.

A 'takeover server restore state' procedure 225 allows the state of the first
server 140 stored in the first non-volatile storage 141 to be transferred via the
interconnect 170 and reconstituted on the second server 150 or procured from the
second non-volatile storage 151. At this point the second server 150 is supporting the
sessions that were active on the first server 140 prior to elective takeover and CIFS
processing within these sessions continues. The file server elective reboot/takeover
process 200 terminates through an "end" terminal 229.

A 'restore state' procedure 227 allows the state of the first server 140 to be 
reconstituted to the state it was in prior to an elective reboot or non-elective reboot from 
the state stored in the first non-volatile storage 141. The file server elective 
reboot/takeover process 200 continues to a 'receive CIFS requests' procedure 207. 
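The pairing of a 'save state' step that sets a flag with a boot-time 'flag active' check can be sketched as follows. This is a minimal sketch, not the patented implementation: a JSON file stands in for the non-volatile storage, the names save_state and boot are hypothetical, and clearing the flag after a restore is an assumption the description does not state.

```python
import json
import os
import tempfile


def save_state(path, state, elective):
    """Write state plus the planned-reboot flag, replacing the old file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        json.dump({"flag": elective, "state": state}, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())   # push the bytes to the device before renaming
    os.replace(tmp_path, path)   # a reader sees either the old file or the new one


def boot(path):
    """Return the state to resume with: restored if the flag is set, else empty."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        record = json.load(f)
    if record["flag"]:
        # Assumption: clear the flag so a later cold boot does not reuse stale state.
        save_state(path, record["state"], elective=False)
        return record["state"]
    return {}
```

Writing to a temporary file and renaming it keeps the saved record whole even if the machine stops partway through a save, which matters because the restore step trusts whatever it finds.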

Method of Operation - Non-Elective Reboot. 

Figure 3 illustrates a file server non-elective reboot process, indicated by
general reference character 300. The file server non-elective reboot process 300 initiates
at a 'start' terminal 301. The file server non-elective reboot process 300 continues to a
'boot system' procedure 303 which allows the first server 140 to boot.

A 'flag active' decision procedure 305 determines whether a non-elective
reboot has occurred. If the 'flag active' decision procedure 305 determines that a
non-elective reboot has not occurred, the file server non-elective reboot process 300
continues to the 'resume normal operation' procedure 309.

A 'restore state' procedure 307 allows the first server 140 to reconstitute
the state it was in prior to the non-elective reboot by copying state from that stored in the
first non-volatile storage 141 or second non-volatile storage 151.

A 'resume normal operation' procedure 309 allows the first server 140 to
once again accept and process CIFS requests and perform all functions it was executing
prior to the non-elective reboot.

The file server non-elective reboot process 300 terminates through an "end"
terminal 311.

Method of Operation - Non-Elective Takeover. 

Figure 4 illustrates a file server non-elective takeover process, indicated by
general reference character 400. The file server non-elective takeover process 400 initiates
at a 'start' terminal 401. The file server non-elective takeover process 400 continues to a
'boot system' procedure 403 which allows the first server 140 to boot.

A 'receive CIFS requests' procedure 405 allows user requests to be 
received by the first server 140. 

A 'process CIFS requests' procedure 407 allows the first server 140 to
respond to requests from the client device 110 by providing access to data contained in
the mass storage 180.

A 'save state' procedure 409 allows the state of the first server 140 to be
saved to the first non-volatile storage 141 and the second non-volatile storage 151. In a
preferred embodiment, reliable state saving in anticipation of a system failure may be
performed at any one of a plurality of specific points within the processing of CIFS
requests. For clarity in the description of this method of operation, 'save state' procedure
409 is indicated only once. The specific points for saving state are further discussed
within this application.

A 'filer 1 failure' decision procedure 411 determines whether the first server
140 has failed in some way. In a preferred embodiment, failure of the first server 140
would be detected by the second server 150. If the 'filer 1 failure' decision procedure 411
determines that the first server 140 has not failed, the file server non-elective takeover
process 400 continues to the 'receive CIFS requests' procedure 405.

A 'restore state' procedure 413 allows the state of the first server 140 prior
to failure to be reconstituted on the second server 150 by copying state from that stored in
the first non-volatile storage 141 or second non-volatile storage 151.

A 'filer 2 takeover' procedure 415 completes the process by allowing the
second server 150 to resume processing of CIFS requests where the first server 140 stopped.
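The failure decision and takeover by the second server can be sketched as below. The description does not say how the second server detects the failure; a heartbeat timeout is assumed here purely for illustration, and the class Partner, its methods, and the 3-second timeout are hypothetical.

```python
class Partner:
    """Second server watching the first; heartbeat detection is an assumption."""

    def __init__(self, mirrored_storage, heartbeat_timeout=3.0):
        self.storage = mirrored_storage   # stands in for second non-volatile storage 151
        self.heartbeat_timeout = heartbeat_timeout
        self.last_heartbeat = 0.0
        self.sessions = None              # None until a takeover has happened

    def on_heartbeat(self, now):
        """Record the time of the latest sign of life from the first server."""
        self.last_heartbeat = now

    def check(self, now):
        """Failure decision: take over once heartbeats have stopped for too long."""
        silent_for = now - self.last_heartbeat
        if self.sessions is None and silent_for > self.heartbeat_timeout:
            self.sessions = dict(self.storage)   # restore state from the mirror
        return self.sessions is not None
```

Because the first server writes each saved state to both stores, the partner in this sketch needs nothing from the failed machine at takeover time.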

The file server non-elective takeover process 400 terminates through an 
"end" terminal 417. 
Method of Operation - Automatic State Saving

Figure 5 illustrates critical state saving points in a mechanism to survive 
server failures when using the CIFS protocol. 

The saved state must always be in a consistent state. Automatic state
saving must occur at specified points within a session of communication between the first
server 140 and the client device 110 to ensure that the saved state is consistent.

POINT 1: State is saved prior to TCP acknowledging an incoming CIFS request.
If the system fails prior to this, then the effect is as if the packet was never received, and 
retransmission by the client device 110 occurs. If the system fails after the 
acknowledgment is sent, then the system has a record that the request came in and it will 
be processed when state is restored. 

POINT 2: State is saved prior to CIFS starting a SMB command. If the system 
fails prior to this, TCP will redeliver the TCP message to CIFS. If the system fails after 
this, the saved state indicates that the first server 140 started work on a CIFS operation. 
(Some single CIFS commands are composite operations: e.g. open, read, and close. In 
such cases, saving state is required before each component operation). 

POINT 3: State is saved when a CIFS operation completes. If the system fails
prior to this, the same CIFS operation is repeated creating the same result. If the system
fails after this, the reply is sent again and TCP treats it as a duplicate.

POINT 4: State is saved after TCP acknowledges the reply. If the system
fails prior to this, then the reply never happened and will be sent again. If the system
fails after the acknowledgment but before the acknowledgment is saved, then we will
duplicate the acknowledgment and normal TCP handling will process that without any
problems. If the system fails after the save has occurred the acknowledgment will not be
repeated.

These four points illustrate where state may be saved in a consistent
manner; however, there are other points where state may be reliably saved and these
points would be obvious to one skilled in the art.
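The four points can be read as a small per-request state machine in which the last point saved determines what happens on restore. The sketch below is illustrative only: the enum SavePoint, the function recovery_action, and the action strings are hypothetical labels, not terms from the CIFS or TCP specifications.

```python
from enum import Enum


class SavePoint(Enum):
    RECEIVED = 1      # POINT 1: saved before TCP acknowledges the request
    STARTED = 2       # POINT 2: saved before CIFS starts the SMB command
    COMPLETED = 3     # POINT 3: saved when the CIFS operation completes
    REPLY_ACKED = 4   # POINT 4: saved after TCP acknowledges the reply


def recovery_action(last_saved):
    """Map the last save point a request reached to the step taken on restore."""
    if last_saved is None:
        # Nothing was recorded: it is as if the packet never arrived.
        return "wait for client retransmission"
    if last_saved in (SavePoint.RECEIVED, SavePoint.STARTED):
        # The request is on record but has no saved result, so it is run (again).
        return "run the operation"
    if last_saved is SavePoint.COMPLETED:
        # The result is on record; a repeated reply is handled as a duplicate.
        return "resend the reply"
    return "nothing to do"
```

The sketch relies on the same property the description does: repeating an operation after a failure produces the same result, so a request can safely be re-run from the last saved point.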

Generality of the Invention

The invention has general applicability to various fields of use, not necessarily
related to the services described above. For example, these fields of use can include one
or more of, or some combination of, the following:

• In addition to general applicability to CIFS, the invention has broad applicability to
other transmission protocols.

Other and further applications of the invention, in its most general form, will be clear
to those skilled in the art after perusal of this application, and are within the scope and
spirit of the invention.



Express Mailing No. EL524780260US 




Alternate Embodiments

Although preferred embodiments are disclosed herein, many variations are
possible which remain within the concept, scope, and spirit of the invention, and these
variations would become clear to those skilled in the art after perusal of this application.
Claims

1. A method of operating a file server, comprising the steps of: 

receiving a CIFS request; and 

recording state at that time about the request; and 

restoring state upon reboot as last recorded; and 

attempting to continue the CIFS session that the request was part of. 

2. The method of claim 1, wherein said step of receiving a CIFS request also includes 
the steps of 

acknowledging receipt of said CIFS request; and 
processing said CIFS request. 

3. The method of claim 1, wherein said step of recording state includes determining 
automatically whether the processing of a CIFS request is at a point where said state 
can be reliably recorded. 

4. The method of claim 3, wherein said step of recording state occurs at points based on 
the progress of processing of a CIFS request. 

5. The method of claim 4, wherein said state is recorded to a non-volatile storage.

6. The method of claim 1, wherein said step of recording state occurs as part of an
elective reboot or elective takeover of a server further comprising:

ignoring current CIFS requests;
processing all active CIFS requests; and
recording state.

7. The method of claim 6, wherein all currently active requests are processed to
completion.

8. The method of claim 1, wherein said step of recording state further comprises the 
step of determining whether said server shutdown was elective or non-elective. 

9. The method of claim 8, wherein said step of determining whether said server
shutdown is elective or non-elective is a function of a flag value stored in said
non-volatile storage.

10. The method of claim 9, wherein said flag value indicates said server shutdown was 
elective. 

11. The method of claim 9, wherein said flag value indicates said server shutdown was
non-elective.

12. The method of claim 1, wherein said step of recording state further comprises the 
step of determining whether recovery will be accomplished by rebooting the affected 
server or takeover by another server. 

13. The method of claim 12, wherein said step of determining whether recovery will be
accomplished by rebooting the affected server or takeover by another server is a
function of said flag value stored in said non-volatile storage.

14. The method of claim 13, wherein said flag value indicates said recovery will be
accomplished by rebooting the affected server.

15. The method of claim 13, wherein said flag value indicates said recovery will be 
accomplished by takeover by another server. 

16. The method of claim 1, wherein said step of restoring state further comprises 
determining whether recovery is by reboot or takeover by another server. 

17. The method of claim 16, wherein said step of determining whether recovery is
accomplished by reboot or takeover by another server is a function of said flag value
stored in said non-volatile storage.

18. The method of claim 17, wherein said reboot comprises the steps of: 

rebooting the affected server's operating system; and 

rebuilding in-memory data structures to the state prior to said reboot. 

19. The method of claim 18, wherein said rebuilding in-memory data structures further
comprises fetching the state stored in said non-volatile storage to rebuild said
in-memory data structures.

20. The method of claim 17, wherein said takeover comprises fetching the state stored in
the non-volatile storage and rebuilding said in-memory data structures in another
server using said state.

21. The method of claim 1, wherein said step of attempting to continue the CIFS session
that the request was part of further comprises the step of processing the remaining
portion of the uncompleted request.

22. Apparatus including;

means for receiving a CIFS request; and

means for recording state at that time about the request; and

on reboot, restoring state as last recorded; and

means for attempting to continue the CIFS session that the request was part
of.

23. The apparatus of claim 22, wherein said means for receiving a CIFS request includes
a means for acknowledging receipt of said CIFS request and a means for processing
the request.

24. The apparatus of claim 22, wherein said means for recording state includes a means
to determine automatically whether the processing of a CIFS request is at a point
where said state can be reliably recorded.

25. The apparatus of claim 24, wherein said means for recording state occurs at points
based on the progress of processing of a CIFS request.

26. The apparatus of claim 25, wherein said state is recorded to a non-volatile storage.

27. The apparatus of claim 22, wherein said means for recording said state occurs as part
of an elective reboot or elective takeover of a server further comprising:

means for ignoring current CIFS requests;
means for processing all active CIFS requests; and
means for recording state.

28. The apparatus of claim 27, wherein all currently active requests are processed to 
completion. 

29. The apparatus of claim 22, wherein said means for recording state further comprises a 
means for determining whether said server shutdown was elective or non-elective. 

30. The apparatus of claim 27, wherein said means for determining whether said server 
shutdown was elective or non-elective is a function of a flag value stored in said non- 
volatile storage. 

31. The apparatus of claim 30, wherein said flag value indicates said server shutdown 
was elective. 

32. The apparatus of claim 30, wherein said flag value indicates said server shutdown 
was non-elective. 

33. The apparatus of claim 22, wherein said means for recording state further comprises a
means for determining whether recovery will be accomplished by rebooting the
affected server or takeover by another server.

34. The apparatus of claim 33, wherein said means for determining whether recovery
will be accomplished by rebooting the affected server or takeover by another server is
a function of said flag value stored in said non-volatile storage.

35. The apparatus of claim 34, wherein said flag value indicates said recovery will be 
accomplished by rebooting the affected server. 

36. The apparatus of claim 34, wherein said flag value indicates said recovery will be 
accomplished by takeover by another server. 

37. The apparatus of claim 22, wherein said means for restoring state further comprises 
means for determining whether recovery is by reboot or takeover by another server. 

38. The apparatus of claim 37, wherein said means for determining whether recovery is
by reboot or takeover by another server is a function of said flag value stored in said
non-volatile storage.

39. The apparatus of claim 38, wherein said reboot further comprises: 

means for rebooting the affected server's operating system; and 

means for rebuilding in-memory data structures to the state prior to said reboot. 

40. The apparatus of claim 39, wherein said means for rebuilding in-memory data
structures further comprises fetching the state stored in said non-volatile storage to
rebuild said in-memory data structures.

41. The apparatus of claim 38, wherein said takeover comprises means for fetching the
state stored in said non-volatile storage and rebuilding said in-memory data structures
in another server using said state.

42. The apparatus of claim 22, wherein said means for attempting to continue the CIFS 
session that the request was part of further comprises a means for processing the 
remaining portion of the uncompleted request. 



43. Non-volatile memory, said non-volatile memory having storage capable of holding
information, said information including;

information identifying the state of a first device; and
information identifying a flag value.

44. The apparatus of claim 43, wherein said flag value is capable of being interpreted to
indicate

rebooting said first device was an elective function;

rebooting said first device was a non-elective function;

takeover of said first device by a second device was an elective function;
and

takeover of said first device by said second device was a non-elective
function.
Abstract

The invention provides a method and system for re-establishing sessions
between a server and its clients following a failure of the server, planned reboot of
the server, or takeover by another server. At critical points within a server/client
session, state is saved so as to be reliable and consistent. Upon reboot of the
system, state is restored using that which was saved, returning the server to its
pre-crash state and preserving sessions that were in progress prior to the reboot.
Additionally, state saved by a first server prior to failure or elective shutdown can
be transferred to a second server in a takeover configuration, also preserving
sessions in progress.